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Abstract 

Although histogram methods have been extremely effective for analyz- 
ing data from Monte Carlo simulations, they do have certain limitations, 
including the range over which they are valid and the difficulties of com- 
bining data from independent simulations. In this paper, we describe 
an complementary approach to extracting information from Monte Carlo 
simulations that uses the matrix of transition probabilities. Combining 
the Transition Matrix with an N-fold way simulation technique produces 
an extremely flexible and efScient approach to rather general Monte Carlo 
simulations. 
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1 Introduction 



This paper presents both an approach to analyzing the information contained 
in configurations generated by general Monte Carlo simulations, and a closely 
related method of simulation that provides great flexibility and surprising effi- 
ciency. The approach uses information contained in the configurations about 
the set of possible changes on the next Monte Carlo step, which we encode in a 
"Transition Matrix." [|j This information is complementary to a histogram 
analysis lliot ' Indeed, it was originally developed as an improvement to the 
histogram approach. The discovery that the transition matrix, by itself, had 
definite advantages over the older methods has opened up the development of 
several new techniques. 

Combining the Transition Matrix with an N-fold way simulation technique 
produces an extremely flexible and efficient approach to rather general Monte 
Carlo simulations. Because the same information is needed for the simulation 
and the analysis, the two methods work extremely efficiently together. In par- 
ticular, the fact that this information is updated and used for every step of the 
N-fold way simulation enables contributions to the transition matrix to be made 
at every MC step, instead of after every sweep, as in cluster simulations. 

Since no use is made of cluster methods that could be limited to non- 
frustrated systems, the techniques described in this paper are extremely general. 
They can be used directly with any system that has equally spaced energy levels, 
and, as is the case for histograms, can be generalized to systems with continuous 
symmetry with the use of binning. 

In the following sections, we will define the transition matrix, describe how 
thermodynamic information is extracted from it, describe the N-fold way (as it is 
used with the transition matrix), describe the creation of generalized ensembles, 
and discuss the advantages of using these methods as part of parallel simulations 
similar to Replica Monte Carlo or Parallel Simulated Tempering. 



We will use the two-dimensional Ising model to demonstrate the transition ma- 
trix approach, although generalizations to higher dimensions and more com- 
plicated models is trivial. In particular, the generalization to a spin glass is 
obvious. The Ising model is given by the Hamiltonian 



where Oi takes on the values ±1. 

Although transition matrices can be defined for any kind of Monte Carlo 
simulation that depends only of the energies of the initial and final states, the 
primary transition matrix is defined in terms of the standard Monte Carlo dy- 
namics at infinite temperature. For this dynamics, each spin is chosen with 
equal probability and "flipped" with probability one. 



2 The Transition Matrix 




(1) 
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For every spin in a configuration, it is easy to calculate the change in energy 
of the configuration when that spin is reversed by simply counting the number of 
neighbors with the same sign. Denoting the number of sites that will produce 
an energy change SE by use, we define the "infinite temperature transition 
matrix" by 

Te,se = {nsE)E /N, (2) 

where N is the total number of sites and the average is taken over all configu- 
rations at energy E. 

The largest eigenvalue of this matrix is unity, and the corresponding eigen- 
vector is the density of states, W{E). 

J2 W{E - 5E)Te-5e,5e = W{E) (3) 

&E 

The elements of the transition matrix must satisfy the condition of detailed 
balance, 

W{E)Te.se = W{E + 5E)Te+se,-se, (4) 
which requires the "TTT-idcntity" , 

l£;,AT!E+A,Ar_E+2A,-2A = TB,2AT!E+2A,-Ar_E+A,-A, (5) 

where A is the smallest allowable energy change. 

For Potts models, the Ising model in three-dimensions, or other models with 
a greater number of possible energy changes, there are additional TTT-identities 
of the form, 

^E,(m-l)A^B-|-(m-l)A,A^B-|-mA,-mA = 7£;,mA2£;-FmA,-A^B-|-(m-l)A,-(m-l)A- 

_ (6) 

The situation is only slightly more complicated if other quantities, like the 
magnetization or second-neighbor interactions, are introduced. In each case, 
the introduction of an additional energy change creates two new elements at 
each energy {+5E and —6E), along with one new identity (possibly involving 
four matrix elements), leaving one new independent variable. 

Transition matrices corresponding to general Monte Carlo dynamics, Te,se, 
can be constructed from the acceptance probabilities, 

f ^ f Te,se X aE,SE fovSE ^ , , 

^'^^ ~ \ TE,sEXaE,5E + ^5E^oTE,SEx{l-aE,SE) iovSE = ' 

The corresponding probability distribution can then be constructed from the 
leading eigenvector of this matrix. 

For example, the standard Metropolis acceptance probabilities are given by 

aE,SE = min [1, exp {-5E/kT)] , (8) 

and the leading eigenvector of the corresponding transition matrix is the usual 
canonical probability distribution. 
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3 The N-Fold Way 



Instead of choosing spins and determining whether to accept each move, it is 
possible to use the set of numbers, {use}, to determine in advance what the 
probabiHty of picking a spin that wiU change the energy by SE and calculating 
its probability of acceptance. A class of spins with SE is then picked with 
probability 

where 

SE 

A spin from that class is chosen and flipped. The contributions to the transition 
matrix (and any other quantity being computed) are weighted with a factor of 

a/A). 

Since the set of numbers {use} are updated at every step for both the 
simulation and the transition matrix, little extra computer time is needed to 
record contributions to the transition matrix at every Monte Carlo step. 

It is clear that any acceptance rates may be used in defining an ensemble for 
Monte Carlo simulations, as long as detailed balance is satisfied. Because of the 
reweighting of the contributions of each configuration, it is not even necessary 
for the acceptance rates to be normalized. 



4 Generalized Ensembles 

To create an ensemble with a desired probability distribution, P{E), a necessary 
condition is 

P{E) X Te,5e X aE,SE = P{E + 5E) x Te+se-se x aE+SE-SE- (H) 

For example, the multicanonical ensemble pl is characterized by P{E) = P{E+ 
6E), so that the condition on the acceptance rates is 

aE,SE _ Te+se,-se ^^^2") 

aE+SE,~5E Te,SE 

Eqn. ^ relates this to the usual condition that the ratios of acceptance rates is 
equal to the ratio of the densities of state at the two energies. 

Eqn. |l^ is not sufficient to determine the acceptance rates uniquely. In 
addition to the usual acceptance rates given by the minimum of the ratio in 
Eqn. ^ or unity, both 

aE^&E ~ Te+se,-se (13) 

and 

aE.SE = l/rs,5_E (14) 
are valid. Many other options exist. 
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5 The "Equal-Hit" Ensemble 



Although the multicanonical ensemble was designed to visit every energy level 
with equal probability, that is not really optimal when using the N-fold way. 
For low energies, the aecoptancc ratio is small and the N-fold way achieves a 
uniform P{E) by visiting an energy level very few times, but weighting it with 
a large factor (A^^)^ ^j.. To scan all energy levels equally, it would be more 
appropriate to specify that the number of visits to each energy level, H{E), is 
uniform. Note that the average of over the visits to energy level, E, with the 
N-fold way is not the same as the microcanonical average over configurations 
with that energy. Denoting the microcanonical average of a quantity, B, by, 
{B)^, we have 

\^ /E,Nf 

If we take B = A, this gives 



Since the probability, P{E), is given by the product of the number of times an 
N-fold way move ends at the energy E times the average inverse acceptance ratio, 
(A~^)^ the condition for an equal-hit ensemble with H{E) = H{E + SE) 
is 

{^~^)E,Nf '^E,5E X aE,5E = i-^'^) E+5E ^ '^E+5E,-5E X aE+5E,-5E- (15) 

This gives the condition on the acceptance rates. 

aE,SE ^ i^~^)E+SE,Nf ^ Te+SE-5E ^^^^ 
a-E+SE-SE {■^~^)E,Nf ^ Te,5E 

This condition can also be fulfilled in many ways, including 

aE,5E = {^~^) E+SE,Nf ^ '^E+SE,-SE (17) 

and 

aE,SE = {^~^) E+5E,Nf I^E,SE- (18) 



6 Equilibration 

At the beginning of a Monte Carlo simulation, there is usually little information 
about either the density of states or the transition matrix. For a canonical 
simulation at a given temperature, this does not cause a problem. However, for 
either a multicanonical simulation or an equal-hit simulation, such information 
is necessary. However, it turns out that the problem of dealing with the lack of 
information is greatly simplified by the transition matrices. 
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The primary approach consists of three stages. 

For the first stage, we start with a random configuration and use one of the 
acceptance rates if it is known, otherwise we arbitrarily set it equal to unity. As 

the simulation in this stage progresses, wc have a changing random walk that 
does not satisfy detailed balance and is biased towards states that have not yet 
been visited. 

For the second stage, we take the approximate (and biased) transition matrix 
from the first stage, and impose the TTT-identities. In the second stage of the 
simulation, we use this approximate transition matrix to define the acceptance 
rates. Although the acceptance rates arc not those of the ensemble that we 
really want to simulate, they do satisfy detailed balance by their construction, 
so that the transition matrix calculated in the second stage is not biased. 

For the third stage, wc use the estimate of the transition matrix (after im- 
posing the TTT-identities) from the second stage to determine the acceptance 
rates. This algorithm represents a good approximation to the ensemble we wish 
to simulate, as well as satisfying detailed balance. The third stage is the main 
part of the simulation; it is where most of the computer time is used. 

A modification of the third stage would be to continually update the tran- 
sition matrix as the simulation progresses. This has the advantage that the 
simulation provides an increasingly accurate representation of the desired en- 
semble. The obvious disadvantage is that, strictly speaking, the algorithm no 
longer satisfies detailed balance. However, we have carried out some very long 
simulations on small systems with this modification, and found that the errors 
decrease with the square root of the length of the simulation, as expected. Note 
that the effect of any bias from the early part of the simulation is expected 
to decrease directly as the length of the simulation, rendering it unimportant. 
Therefore, we believe that the small violation of detailed balance is harmless, 
and will improve the efficiency of the algorithm. 

The two-stage equilibration before the main simulation in the third stage 
requires relatively little computer time, so that this approach turns out to be 
more efficient than the standard multicanonical procedure, as well as being much 
simpler. 

7 Parallel Simulations 

A great advantage of a multicanonical simulation, which is retained by the 
equal-hit simulation, is that information over the entire range of energies can 
be obtained. This means that a single simulation provides the thermodynamic 
behavior at all temperatures - even negative temperatures (also known as the 
antiferromagnetic version of the model). 

However, these advantages come with a price. The relaxation time to equi- 
librate a random walk is proportional to the square of its range. Since the total 
energy range is proportional to the number of particles, N, the relaxation time 
in units of MC steps is proportional to iV^. This does, indeed, turn out to be 
the case for the simulations described above. 
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We can greatly reduce this disadvantage by introducing parallel simulations. 
If we consider a number, of parallel simulations on independent replicas of 
the system, in which each replica is restricted to a range proportional to N/l^ 
the relaxation time for each replica is only proportional to {N/Vf . Taking the 
increased number of simulations into account, we are left with a relaxation time 
in units of MC steps proportional to N'^/l. The relaxation time in units of MC 
steps/site is proportional to N/l. 

Note that this improvement occurs even with a serial machine. Implemen- 
tation on a parallel machine is trivial. 

We can further improve the algorithm by exchanging replicas, as introduced 
in the Replica Monte Carlo method |]l3| ~ fl^ and later rediscovered as parallel 
simulated tempering [^. 

Break up the total energy range into pieces with equal width. Beginning 
with the lowest energies, restrict the first replica to the first two pieces of the 
energy spectrum. The second replica is restricted to the second and third pieces, 
and so on. Replica n and replica n + 1 therefore overlap for half of their range 
of energies. After simulating for a few relaxation times (proportional to N'^/l), 
all neighboring replicas that are in the overlapping half of their ranges are ex- 
changed. Since we are either using a multicanonical simulation with single spin 
flips, or an equal-hit ensemble with the N-fold way, the acceptance probability 
for such exchanges is unity. 

This combination of replica exchange with an equal-hit (or multicanonical) 
simulation provides improvements over both. Although this approach has been 
discussed in terms of transition matrix Monte Carlo, its use is much broader. For 
example, it could easily be applied to multicanonical simulations of biological 
molecules, as we intend to do in future work. 
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